翻訳と辞書
Words near each other
・ "O" Is for Outlaw
・ "O"-Jung.Ban.Hap.
・ "Ode-to-Napoleon" hexachord
・ "Oh Yeah!" Live
・ "Our Contemporary" regional art exhibition (Leningrad, 1975)
・ "P" Is for Peril
・ "Pimpernel" Smith
・ "Polish death camp" controversy
・ "Pro knigi" ("About books")
・ "Prosopa" Greek Television Awards
・ "Pussy Cats" Starring the Walkmen
・ "Q" Is for Quarry
・ "R" Is for Ricochet
・ "R" The King (2016 film)
・ "Rags" Ragland
・ ! (album)
・ ! (disambiguation)
・ !!
・ !!!
・ !!! (album)
・ !!Destroy-Oh-Boy!!
・ !Action Pact!
・ !Arriba! La Pachanga
・ !Hero
・ !Hero (album)
・ !Kung language
・ !Oka Tokat
・ !PAUS3
・ !T.O.O.H.!
・ !Women Art Revolution


Dictionary Lists
翻訳と辞書 辞書検索 [ 開発暫定版 ]
スポンサード リンク

exploratory data analysis : ウィキペディア英語版
exploratory data analysis

In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task. Exploratory data analysis was promoted by John Tukey to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments. EDA is different from initial data analysis (IDA), which focuses more narrowly on checking assumptions required for model fitting and hypothesis testing, and handling missing values and making transformations of variables as needed. EDA encompasses IDA.
==Overview==

Tukey defined data analysis in 1961 as: "()rocedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data."〔(John Tukey-The Future of Data Analysis-July 1961 )〕
Tukey's championing of EDA encouraged the development of statistical computing packages, especially ''S'' at Bell Labs. The ''S'' programming language inspired the systems 'S'-PLUS and ''R''. This family of statistical-computing environments featured vastly improved dynamic visualization capabilities, which allowed statisticians to identify outliers, trends and patterns in data that merited further study.
Tukey's EDA was related to two other developments in statistical theory: Robust statistics and nonparametric statistics, both of which tried to reduce the sensitivity of statistical inferences to errors in formulating statistical models. Tukey promoted the use of five number summary of numerical data—the two extremes (maximum and minimum), the median, and the quartiles—because these median and quartiles, being functions of the empirical distribution are defined for all distributions, unlike the mean and standard deviation; moreover, the quartiles and median are more robust to skewed or heavy-tailed distributions than traditional summaries (the mean and standard deviation). The packages ''S'', ''S''-PLUS, and ''R'' included routines using resampling statistics, such as Quenouille and Tukey's jackknife and Efron bootstrap, which are nonparametric and robust (for many problems).
Exploratory data analysis, robust statistics, nonparametric statistics, and the development of statistical programming languages facilitated statisticians' work on scientific and engineering problems. Such problems included the fabrication of semiconductors and the understanding of communications networks, which concerned Bell Labs. These statistical developments, all championed by Tukey, were designed to complement the analytic theory of testing statistical hypotheses, particularly the Laplacian tradition's emphasis on exponential families.

抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)
ウィキペディアで「exploratory data analysis」の詳細全文を読む



スポンサード リンク
翻訳と辞書 : 翻訳のためのインターネットリソース

Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.